NSF PAR Search | NSF Public Access Repository

Trans-Glasso: A Transfer Learning Approach to Precision Matrix Estimation

Zhao, Boxin; Ma, Cong; Kolar, Mladen (November 2024, arXiv)

Inconsistency of Cross-Validation for Structure Learning in Gaussian Graphical Models

Lyu, Zhao; Tai, Wai Ming; Kolar, Mladen; Aragam, Bryon (May 2024, Proceedings of The 27th International Conference on Artificial Intelligence and Statistics)

Despite numerous years of research into the merits and trade-offs of various model selection criteria, obtaining robust results that elucidate the behavior of cross-validation remains a challenging endeavor. In this paper, we highlight the inherent limitations of cross-validation when employed to discern the structure of a Gaussian graphical model. We provide finite-sample bounds on the probability that the Lasso estimator for the neighborhood of a node within a Gaussian graphical model, optimized using a prediction oracle, misidentifies the neighborhood. Our results pertain to both undirected and directed acyclic graphs, encompassing general, sparse covariance structures. To support our theoretical findings, we conduct an empirical investigation of this inconsistency by contrasting our outcomes with other commonly used information criteria through an extensive simulation study. Given that many algorithms designed to learn the structure of graphical models require hyperparameter selection, the precise calibration of this hyperparameter is paramount for accurately estimating the inherent structure. Consequently, our observations shed light on this widely recognized practical challenge.
Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems

https://doi.org/10.1145/3508029

Luo, Yuwei; Gupta, Varun; Kolar, Mladen (February 2022, Proceedings of the ACM on Measurement and Analysis of Computing Systems)

We consider the problem of controlling a Linear Quadratic Regulator (LQR) system over a finite horizon $T$ with fixed and known cost matrices $Q, R$, but unknown and non-stationary dynamics $A_t, B_t$. The sequence of dynamics matrices can be arbitrary, but with a total variation, $V_T$, assumed to be $o(T)$ and unknown to the controller. Under the assumption that a sequence of stabilizing, but potentially sub-optimal controllers is available for all $t$, we present an algorithm that achieves the optimal dynamic regret of $O(V_T^{2/5} T^{3/5})$. With piecewise constant dynamics, our algorithm achieves the optimal regret of $O(\sqrt{ST})$, where $S$ is the number of switches. The crux of our algorithm is an adaptive non-stationarity detection strategy, which builds on an approach recently developed for contextual Multi-armed Bandit problems. We also argue that non-adaptive forgetting (e.g., restarting or using sliding window learning with a static window size) may not be regret optimal for the LQR problem, even when the window size is optimally tuned with the knowledge of $V_T$. The main technical challenge in the analysis of our algorithm is to prove that the ordinary least squares (OLS) estimator has a small bias when the parameter to be estimated is non-stationary. Our analysis also highlights that the key motif driving the regret is that the LQR problem is in spirit a bandit problem with linear feedback and locally quadratic cost. This motif is more universal than the LQR problem itself, and therefore we believe our results should find wider application.
A Nonconvex Framework for Structured Dynamic Covariance Recovery

Tsai, Katherine; Kolar, Mladen; Koyejo, Oluwasanmi (January 2022, Journal of Machine Learning Research)

We propose a flexible, yet interpretable model for high-dimensional data with time-varying second-order statistics, motivated by and applied to functional neuroimaging data. Our approach implements the neuroscientific hypothesis of discrete cognitive processes by factorizing covariances into sparse spatial and smooth temporal components. Although this factorization results in parsimony and domain interpretability, the resulting estimation problem is nonconvex. We design a two-stage optimization scheme with a tailored spectral initialization, combined with iteratively refined alternating projected gradient descent. We prove a linear convergence rate up to a nontrivial statistical error for the proposed descent scheme and establish sample complexity guarantees for the estimator. Empirical results using simulated data and brain imaging data illustrate that our approach outperforms existing baselines.
Kernel Meets Sieve: Post-Regularization Confidence Bands for Sparse Additive Model

https://doi.org/10.1080/01621459.2019.1689984

Lu, Junwei; Kolar, Mladen; Liu, Han (October 2020, Journal of the American Statistical Association)
Kernel Meets Sieve: Post-Regularization Confidence Bands for Sparse Additive Model

Lu, Junwei; Kolar, Mladen; Liu, Han (January 2020, Journal of the American Statistical Association)

Joint Gaussian graphical model estimation: A survey

https://doi.org/10.1002/wics.1582

Tsai, Katherine; Koyejo, Oluwasanmi; Kolar, Mladen (April 2022, WIREs Computational Statistics)

Graphs representing complex systems often share a partial underlying structure across domains while retaining individual features. Thus, identifying common structures can shed light on the underlying signal, for instance, when applied to scientific discovery or clinical diagnoses. Furthermore, growing evidence shows that the shared structure across domains boosts the estimation power of graphs, particularly for high-dimensional data. However, building a joint estimator to extract the common structure may be more complicated than it seems, most often due to data heterogeneity across sources. This manuscript surveys recent work on statistical inference of joint Gaussian graphical models, identifying model structures that fit various data generation processes. This article is categorized under: Data: Types and Structure > Graph and Network Data; Statistical Models > Graphical Models.
Distributed Stochastic Multi-Task Learning with Graph Regularization

Wang, Weiran; Wang, Jialei; Kolar, Mladen; Srebro, Nathan (February 2018, arXiv.org)

We propose methods for distributed graph-based multi-task learning that are based on weighted averaging of messages from other machines. Uniform averaging or diminishing stepsize in these methods would yield consensus (single task) learning. We show how simply skewing the averaging weights or controlling the stepsize allows learning different, but related, tasks on the different machines.
Distributed Multi-Task Learning with Shared Representation

Wang, Jialei; Kolar, Mladen; Srebro, Nathan (March 2016, arXiv.org)

We study the problem of distributed multitask learning with shared representation, where each machine aims to learn a separate, but related, task in an unknown shared low-dimensional subspace, i.e., when the predictor matrix has low rank. We consider a setting where each task is handled by a different machine, with samples for the task available locally on the machine, and study communication-efficient methods for exploiting the shared structure.
Distributed Multi-Task Learning

Wang, Jialei; Kolar, Mladen; Srebro, Nathan (January 2016, Journal of Machine Learning Research)

We consider the problem of distributed multitask learning, where each machine learns a separate, but related, task. Specifically, each machine learns a linear predictor in high-dimensional space, where all tasks share the same small support. We present a communication-efficient estimator based on the debiased lasso and show that it is comparable with the optimal centralized method.